Adaptive approximate Bayesian computation for complex models
Approximate Bayesian computation (ABC) is a family of computational techniques in Bayesian statistics. These techniques make it possible to fit a model to data without computing the model likelihood; instead, they require a large number of simulations of the model being fitted. A number of refinements to the original rejection-based ABC scheme have been proposed, including the sequential improvement of posterior distributions. This technique decreases the number of model simulations required, but it still has several shortcomings that are particularly problematic for complex models that are costly to simulate. Here we provide a new algorithm for adaptive approximate Bayesian computation, which is shown to perform better on both a toy example and a complex social model.
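To make the rejection-and-refinement idea concrete, here is a minimal Python sketch of a sequential (population Monte Carlo style) ABC sampler. The Gaussian toy model, uniform prior, tolerance schedule, and kernel width are illustrative assumptions, not the algorithm proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, n=50):
    # Toy model: data are Gaussian with unknown mean theta.
    return rng.normal(theta, 1.0, size=n)

def distance(x, y):
    # Distance between summary statistics (here: sample means).
    return abs(x.mean() - y.mean())

observed = simulate(2.0)
tolerances = [1.0, 0.5, 0.2, 0.1]           # decreasing tolerance schedule
n_particles = 500

# Generation 0: plain rejection ABC from the prior U(-10, 10).
particles = []
while len(particles) < n_particles:
    theta = rng.uniform(-10, 10)
    if distance(simulate(theta), observed) < tolerances[0]:
        particles.append(theta)
particles = np.array(particles)
weights = np.full(n_particles, 1.0 / n_particles)

# Later generations: resample, perturb, and re-weight (importance sampling).
for eps in tolerances[1:]:
    sigma = 2.0 * particles.std()           # adaptive kernel width
    new_particles, new_weights = [], []
    while len(new_particles) < n_particles:
        theta = rng.choice(particles, p=weights) + rng.normal(0.0, sigma)
        if not -10.0 < theta < 10.0:
            continue                         # fell outside the prior support
        if distance(simulate(theta), observed) < eps:
            # Importance weight: flat prior over the kernel mixture density.
            kernel = np.exp(-0.5 * ((theta - particles) / sigma) ** 2)
            new_particles.append(theta)
            new_weights.append(1.0 / np.sum(weights * kernel))
    particles = np.array(new_particles)
    weights = np.array(new_weights) / np.sum(new_weights)

print("approximate posterior mean:", np.average(particles, weights=weights))
```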
Non-linear regression models for Approximate Bayesian Computation
Approximate Bayesian inference on the basis of summary statistics is well suited to complex problems for which the likelihood is either mathematically or computationally intractable. However, methods based on rejection suffer from the curse of dimensionality as the number of summary statistics increases. Here we propose a machine-learning approach to the estimation of the posterior density that introduces two innovations. The new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves estimation using importance sampling. The new algorithm is compared to state-of-the-art approximate Bayesian methods and achieves a considerable reduction of the computational burden in two examples of inference, in statistical genetics and in a queueing model.
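The adjustment idea can be illustrated with a much simpler linear variant: regress the accepted parameters on the summaries and project them to the observed summary value. The sketch below uses a plain least-squares adjustment in place of the paper's nonlinear conditional heteroscedastic regression; the toy model and the 1% acceptance quantile are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta):
    x = rng.normal(theta, 1.0, size=100)
    return np.array([x.mean(), x.std()])   # two summary statistics

s_obs = simulate(1.5)

# Rejection step: keep the simulations closest to the observed summaries.
thetas = rng.uniform(-5, 5, size=20000)
summaries = np.array([simulate(t) for t in thetas])
dist = np.linalg.norm(summaries - s_obs, axis=1)
keep = dist < np.quantile(dist, 0.01)       # accept the closest 1%

# Adjustment step: regress theta on the summaries among accepted draws
# and project each accepted theta to the observed summary value.
S = summaries[keep] - s_obs                 # centred design matrix
X = np.column_stack([np.ones(S.shape[0]), S])
beta, *_ = np.linalg.lstsq(X, thetas[keep], rcond=None)
theta_adj = thetas[keep] - S @ beta[1:]     # fitted value at s_obs + residual

print("raw posterior mean     :", thetas[keep].mean())
print("adjusted posterior mean:", theta_adj.mean())
```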
Choosing summary statistics by least angle regression for approximate Bayesian computation
Bayesian statistical inference relies on the posterior distribution. Depending on the model, the posterior can be more or less difficult to derive. In recent years, there has been a lot of interest in complex settings where the likelihood is analytically intractable. In such situations, approximate Bayesian computation (ABC) provides an attractive way of carrying out Bayesian inference. To obtain reliable posterior estimates, however, it is important to keep the approximation errors in ABC small. The choice of an appropriate set of summary statistics plays a crucial role in this effort. Here, we report the development of a new algorithm, based on least angle regression, for choosing summary statistics. In two population genetic examples, the performance of the new algorithm is better than that of a previously proposed approach that uses partial least squares.
Funding: Higher Education Commission (HEC); College Deanship of Scientific Research, King Saud University, Riyadh, Saudi Arabia, research group project RGP-VPP-280.
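A rough sketch of the underlying idea, assuming scikit-learn's Lars estimator: fit least angle regression of the parameter on candidate summaries from pilot simulations, and use the order in which predictors enter the path as a ranking. The toy model, the noise columns, and keeping the first k statistics are illustrative choices, not the paper's algorithm.

```python
import numpy as np
from sklearn.linear_model import Lars

rng = np.random.default_rng(2)
n_sims, n_stats = 5000, 20

theta = rng.uniform(0, 10, size=n_sims)
# Candidate summaries: a few informative ones plus pure-noise columns.
S = rng.normal(0, 1, size=(n_sims, n_stats))
S[:, 0] += theta            # informative
S[:, 1] += 0.5 * theta      # weakly informative
S[:, 2] += 0.1 * theta      # barely informative

lars = Lars(n_nonzero_coefs=n_stats).fit(S, theta)
# LARS adds predictors one at a time; the entry order ranks the
# candidate summaries by their usefulness for predicting theta.
print("entry order of summaries:", lars.active_)

k = 3
chosen = lars.active_[:k]   # use only these summaries inside ABC
print("summaries kept for ABC:", chosen)
```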
Methods for detecting associations between phenotype and aggregations of rare variants
Although genome-wide association studies have uncovered variants associated with more than 150 traits, the percentage of phenotypic variation explained by these associations remains small. This has led to the search for the "dark matter" that accounts for this missing genetic component of heritability. One potential source of this dark matter is rare variants, and several statistics have been devised to detect associations resulting from aggregations of rare variants in relatively short regions of interest, such as candidate genes. In this paper we investigate the feasibility of extending this approach in an agnostic way, considering all variants within a much broader region of interest, such as an entire chromosome or even the entire exome. Our method searches for subsets of variant sites using either Markov chain Monte Carlo or genetic algorithms. The analysis was performed with knowledge of the Genetic Analysis Workshop 17 answers.
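A minimal sketch of the subset-search idea, using a Metropolis-style walk over inclusion indicators for variant sites; the burden score, the squared-correlation objective, and the temperature are toy assumptions rather than the statistics used in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n_subjects, n_sites = 400, 50

genotypes = rng.binomial(1, 0.02, size=(n_subjects, n_sites))  # rare variants
causal = [0, 1, 2]                                             # hidden truth
phenotype = genotypes[:, causal].sum(axis=1) + rng.normal(0, 1, n_subjects)

def score(mask):
    # Association score: squared correlation between the aggregate
    # burden over the selected sites and the phenotype.
    if not mask.any():
        return 0.0
    burden = genotypes[:, mask].sum(axis=1)
    if burden.std() == 0:
        return 0.0
    return np.corrcoef(burden, phenotype)[0, 1] ** 2

mask = rng.random(n_sites) < 0.5
current = score(mask)
for _ in range(20000):
    j = rng.integers(n_sites)
    mask[j] = ~mask[j]                      # propose flipping one site
    proposed = score(mask)
    # Accept uphill moves always, downhill moves with small probability.
    if proposed >= current or rng.random() < np.exp((proposed - current) / 0.01):
        current = proposed
    else:
        mask[j] = ~mask[j]                  # revert the flip

print("selected sites:", np.flatnonzero(mask), "score:", round(current, 3))
```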
Simulation-based model selection for dynamical systems in systems and population biology
Computer simulations have become an important tool across the biomedical sciences and beyond. For many important problems, several different models or hypotheses exist, and choosing which one best describes reality or the observed data is not straightforward. We therefore require suitable statistical tools that allow us to choose rationally between different mechanistic models of, for example, signal transduction or gene regulation networks. This is particularly challenging in systems biology, where only a small number of molecular species can be assayed at any given time and all measurements are subject to measurement uncertainty. Here we develop such a model selection framework based on approximate Bayesian computation and employing sequential Monte Carlo sampling. We show that our approach can be applied across a wide range of biological scenarios, and we illustrate its use on real data describing influenza dynamics and the JAK-STAT signalling pathway. Bayesian model selection strikes a balance between the complexity of the simulation models and their ability to describe observed data. The present approach enables us to apply the full formal apparatus of Bayesian model selection to any system that can be (efficiently) simulated, even when exact likelihoods are computationally intractable.
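The model selection logic can be sketched with plain rejection ABC: draw a model index from its prior, simulate under that model, and accept when the simulation lands close to the data; the accepted indices then estimate posterior model probabilities. The paper uses a sequential Monte Carlo version of this idea; the two toy count models below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate(model, n=100):
    if model == 0:
        return rng.poisson(rng.uniform(0, 10), size=n)        # Poisson model
    theta = rng.uniform(0, 10)
    return rng.negative_binomial(2, 2 / (2 + theta), size=n)  # overdispersed

def summaries(x):
    return np.array([x.mean(), x.var()])

observed = rng.negative_binomial(2, 2 / (2 + 4.0), size=100)
s_obs = summaries(observed)

accepted = []
for _ in range(50000):
    m = rng.integers(2)                     # uniform prior over models
    if np.linalg.norm(summaries(simulate(m)) - s_obs) < 2.0:
        accepted.append(m)

accepted = np.array(accepted)
for m in (0, 1):
    print(f"P(model {m} | data) ~ {(accepted == m).mean():.2f}")
```

Because the observed counts are overdispersed, the accepted indices should concentrate on the negative binomial model.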
Bayesian Parameter Estimation for Latent Markov Random Fields and Social Networks
Undirected graphical models are widely used in statistics, physics and machine vision. However, Bayesian parameter estimation for undirected models is extremely challenging, since evaluation of the posterior typically involves the calculation of an intractable normalising constant. This problem has received much attention, but very little of it has focused on the important practical case where the data consist of noisy or incomplete observations of the underlying hidden structure. This paper specifically addresses this problem, comparing two alternative methodologies. In the first approach, particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently explore the parameter space, combined with the exchange algorithm (Murray et al., 2006) to avoid the calculation of the intractable normalising constant (a proof showing that this combination targets the correct distribution is found in a supplementary appendix online). This approach is compared with approximate Bayesian computation (Pritchard et al., 1999). Applications to estimating the parameters of Ising models and exponential random graphs from noisy data are presented. Each algorithm used in the paper targets an approximation to the true posterior, due to the use of MCMC to simulate from the latent graphical model in lieu of being able to do so exactly in general. The supplementary appendix also describes the nature of the resulting approximation.
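For intuition, here is a sketch of the exchange algorithm for the coupling parameter of a small Ising model. As the abstract notes, the auxiliary draw should ideally be exact; a few vectorized Gibbs sweeps stand in for it here, so the sketch targets an approximate posterior. Grid size, prior, proposal, and sweep count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
L = 10  # torus side length (even, so checkerboard sweeps are valid)

def neighbours_sum(s):
    return (np.roll(s, 1, 0) + np.roll(s, -1, 0) +
            np.roll(s, 1, 1) + np.roll(s, -1, 1))

def suff_stat(s):
    # Sum of spin products over each (right, down) neighbour pair once.
    return (s * np.roll(s, 1, 0)).sum() + (s * np.roll(s, 1, 1)).sum()

def gibbs_sample(theta, sweeps=150):
    # Approximate draw from the Ising model via checkerboard Gibbs.
    s = rng.choice([-1, 1], size=(L, L))
    colour = np.indices((L, L)).sum(axis=0) % 2 == 0
    for _ in range(sweeps):
        for mask in (colour, ~colour):
            p = 1.0 / (1.0 + np.exp(-2.0 * theta * neighbours_sum(s)))
            s[mask] = np.where(rng.random((L, L))[mask] < p[mask], 1, -1)
    return s

y = gibbs_sample(0.3)                       # "observed" field, true theta = 0.3
theta, stat_y = 0.1, suff_stat(y)
samples = []
for _ in range(500):
    theta_new = theta + rng.normal(0, 0.05)
    if 0 < theta_new < 1:                   # uniform prior on (0, 1)
        w = gibbs_sample(theta_new)         # auxiliary data at theta_new
        # The intractable normalising constants cancel in this ratio of
        # unnormalised likelihoods (the key trick of the exchange algorithm).
        log_ratio = ((theta_new - theta) * stat_y
                     + (theta - theta_new) * suff_stat(w))
        if np.log(rng.random()) < log_ratio:
            theta = theta_new
    samples.append(theta)

print("posterior mean of theta ~", np.mean(samples))
```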
MSMC and MSMC2: the multiple sequentially Markovian coalescent
The Multiple Sequentially Markovian Coalescent (MSMC) is a population genetic method and software for inferring demographic history and population structure through time from genome sequences. Here we describe the main program MSMC and its successor MSMC2. We go through all the necessary steps of processing genomic data, from BAM files all the way to generating plots of inferred population size and separation histories. Some background on the methodology itself is provided, as well as bash scripts and Python source code to run the necessary programs. The reader is also referred to community resources such as a mailing list and GitHub repositories for further advice.
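As a small illustration of the final plotting step, the sketch below converts an MSMC-style output table into an effective population size curve. The file name, the column layout (time_index, left_time_boundary, right_time_boundary, and a lambda column, all scaled by the mutation rate), and the mutation rate and generation time values are assumptions; consult the MSMC documentation for the exact format produced by your run.

```python
import matplotlib.pyplot as plt
import pandas as pd

mu = 1.25e-8      # per-generation mutation rate (assumed value)
gen_time = 29     # years per generation (assumed value)

df = pd.read_csv("output.final.txt", sep=r"\s+")
# Single-population runs typically name the coalescence-rate column
# "lambda"; multi-population MSMC output uses "lambda_00" etc.
lam = df["lambda"] if "lambda" in df.columns else df["lambda_00"]

years = df["left_time_boundary"] / mu * gen_time   # scaled time -> years
ne = 1.0 / lam / (2.0 * mu)                        # scaled rate -> N_e

plt.step(years, ne, where="post")
plt.xscale("log")
plt.yscale("log")
plt.xlabel("years before present")
plt.ylabel("effective population size")
plt.savefig("msmc_popsize.png", dpi=150)
```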
ABCtoolbox: a versatile toolkit for approximate Bayesian computations
BACKGROUND: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of approximate Bayesian computation (ABC) algorithms, which allow parameter posterior distributions to be obtained from simulations without likelihood computations. RESULTS: Here we present ABCtoolbox, a series of open-source programs for performing approximate Bayesian computations. It implements various ABC algorithms, including rejection sampling, MCMC without likelihood, a particle-based sampler, and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can interact with most simulation and summary-statistic computation programs. The usability of ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates, and to find that males show smaller population sizes but much higher levels of migration than females. CONCLUSION: ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from sampling parameters from prior distributions, through data simulation, computation of summary statistics, estimation of posterior distributions, model choice, and validation of the estimation procedure, to visualization of the results.
Using DNA Methylation Patterns to Infer Tumor Ancestry
Background: Exactly how human tumors grow is uncertain because serial observations are impractical. One approach to reconstructing the histories of individual human cancers is to analyze the current genomic variation between their cells. The greater the variation, on average, the greater the time since the last clonal evolution cycle (a "molecular clock" hypothesis). Here we analyze passenger DNA methylation patterns from opposite sides of 12 primary human colorectal cancers (CRCs) to evaluate whether the variation (pairwise distances between epialleles) is consistent with a single clonal expansion after transformation. Methodology/Principal Findings: Data from 12 primary CRCs are compared to epigenomic data simulated under a single clonal expansion for a variety of possible growth scenarios. We find that for many different growth rates, a single clonal expansion can explain the population variation in 11 of 12 CRCs. In eight CRCs, the cells from different glands are all equally distantly related, and cells sampled from the same tumor half appear no more closely related than cells sampled from opposite tumor halves. In these tumors, growth appears consistent with a single "symmetric" clonal expansion. In three CRCs, the variation in epigenetic distances differed between sides, but this asymmetry could be explained by a single clonal expansion in which one region of the tumor underwent more cell division than the other. The variation in one CRC was complex and inconsistent with a simple single clonal expansion.
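The pairwise-distance comparison at the heart of this analysis is easy to sketch: code each epiallele as a binary methylation pattern and compare Hamming distances within and between tumor sides. The simulated patterns below are stand-ins, not the paper's data.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(6)
n_cells, n_cpg = 8, 16

# Epialleles sampled from two sides of a tumor (toy data).
side_a = rng.binomial(1, 0.3, size=(n_cells, n_cpg))
side_b = rng.binomial(1, 0.3, size=(n_cells, n_cpg))

def mean_pairwise(patterns):
    # Mean Hamming distance over all pairs within one side.
    return np.mean([np.sum(p != q) for p, q in combinations(patterns, 2)])

def mean_between(pa, pb):
    # Mean Hamming distance over all cross-side pairs.
    return np.mean([np.sum(p != q) for p in pa for q in pb])

# Under a single "symmetric" clonal expansion, within-side and
# between-side distances should be about equal.
print("within side A :", mean_pairwise(side_a))
print("within side B :", mean_pairwise(side_b))
print("between sides :", mean_between(side_a, side_b))
```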
A Simulated Annealing Approach to Approximate Bayes Computations
Approximate Bayes Computations (ABC) are used for parameter inference when the likelihood function of the model is expensive to evaluate but relatively cheap to sample from. In particle ABC, an ensemble of particles in the product space of model outputs and parameters is propagated in such a way that its output marginal approaches a delta function at the data and its parameter marginal approaches the posterior distribution. Inspired by simulated annealing, we present a new class of particle algorithms for ABC, based on a sequence of Metropolis kernels associated with a decreasing sequence of tolerances with respect to the data. Unlike other algorithms, this class is not based on importance sampling, and hence does not suffer from a loss of effective sample size due to resampling. We prove convergence under a condition on the speed at which the tolerance is decreased. Furthermore, we present a scheme that adapts the tolerance and the jump distribution in parameter space according to mean-fields of the ensemble, which preserves the statistical independence of the particles in the limit of infinite sample size. This adaptive scheme aims to converge as close as possible to the correct result with as few system updates as possible, by minimizing the entropy production in the system. The performance of this new class of algorithms is compared against two other recent algorithms on two toy examples.
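A single-chain caricature of the idea, for intuition only: an ABC Metropolis kernel whose tolerance is annealed downward on a fixed schedule. The paper's algorithms instead evolve a whole particle ensemble and adapt the tolerance and jump distribution from mean-fields of that ensemble; the toy model, schedule, and jump size here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate(theta):
    # Cheap-to-sample model output: mean of 50 Gaussian draws.
    return rng.normal(theta, 1.0, size=50).mean()

x_obs = 2.0
theta = 0.0
samples = []

for step in range(20000):
    eps = max(1.0 * 0.9995 ** step, 0.05)   # annealed tolerance schedule
    theta_new = theta + rng.normal(0, 0.3)  # Metropolis jump
    if -10 < theta_new < 10:                # uniform prior support
        x_new = simulate(theta_new)
        # Accept when the fresh simulation lands within the current
        # tolerance of the data (hard-threshold ABC kernel).
        if abs(x_new - x_obs) < eps:
            theta = theta_new
    samples.append(theta)

burn = 10000
print("approximate posterior mean:", np.mean(samples[burn:]))
```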